INTRODUCTION

The  subject  of  this  report  is  a  relatively  new  topic  in  probability  and 
statistics.  Most  of  the  papers  dealing  with  statistical  inference  in  Markov 
chains  appeared  during  the  past  fifteen  years.  Markov  chains  are  defined  in 
standard  texts  on  probability  and  stochastic  processes  such  as  Feller  (1957) 
and  Parzen  (1961). 

Anderson (1957) studied statistical inference in Markov chains for the
case of repeated observations on the same chain. In this circumstance, each
observation is a sequence of states, over a finite number $T$ of time points,
from a Markov chain with common transition probability matrices
$P_t = \{p_{ij}(t)\}$, and $n_i(0)$ observations are in state $i$ at time zero.
Under the assumption that $n_i(0) \to \infty$, he presented likelihood ratio
tests for the following hypotheses: (a) $P_t$ is stationary (i.e.,
$P_t = P = \{p_{ij}\}$) against the alternative that it varies over time,
(b) $P$ is a given matrix against the alternative that it is not, (c) the
process is 1st-order against the alternative that it is 2nd-order.

Anderson and Goodman (1957) presented $\chi^2$-tests for goodness of fit
which are analogous to the likelihood ratio tests for the three hypotheses
just stated and for the following: (d) the process is $r$th-order against the
alternative that it is $(r+1)$st-order, (e) the $s$ samples of observations are
samples from the same Markov chain $P$.

Problems of estimation of transition probabilities, testing of goodness
of fit, and order of a chain were studied by Bartlett (1951) and Hoel (1954)
for the situation where only a single sequence of states is observed. They
considered the asymptotic theory as the number of time points increases.

The work of Anderson and Goodman (1957) has promoted application of the
theory of Markov chains in a number of different disciplines, and it is their
work which will be reported in some detail.

An  example  by  Anderson  (1954)  in  which  he  applied  statistical  methods 
to  the  problem  of  studying  voter  intentions  introduced  repeated  Markov 
chains  to  social  scientists.   Gabriel  and  Neumann  (1957)  and  Feyerherm 
and  Bark  (1965)  have  applied  the  theory  to  the  study  of  precipitation  patterns. 


ESTIMATION  OF  THE  PARAMETERS  OF  A  1ST-ORDER  MARKOV  CHAIN 

Definition  and  Notation 

Consider a sequence of observations in which each observation can be in
any one of $m$ distinct states at a discrete time point $t$. Let $p_{ij}(t)$
($i,j = 1,2,\dots,m$; $t = 1,2,\dots,T$) be the probability of state $j$ at
time $t$, given state $i$ at time $(t-1)$. The transition probability matrices
are defined by

$$P_t = \{p_{ij}(t)\} \tag{1.1}$$

where

(1) $p_{ij}(t) \ge 0$ ; for all $(i,j)$ and $t$,

(2) $\sum_{j=1}^{m} p_{ij}(t) = 1$ ; for all $i$ and $t$,

(3) $p_{ij}(t) = \sum_{k} p_{ik}(t')\,p_{kj}(t)$ ; for any times $t > t' \ge 0$
and states $i$ and $j$.

The probability law of a homogeneous Markov chain $P$ is completely
determined once one knows the transition probability matrices given by
$P = \{p_{ij}(t)\}$ and the unconditional probability vector
$p(0) = \{p_i(0)\}$ at time zero (see Parzen, 1962, p. 196).


Model 


Assume that there are $n_i(0)$ individuals in state $i$ at $t = 0$. An
observation on a given individual consists of the sequence of states that the
individual is in at $t = 0, 1, \dots, T$, namely $i(0), i(1), \dots, i(T)$.
If the initial state $i(0)$ is given, then there are $m^T$ possible sequences.
For a 1st-order Markov chain, these represent mutually exclusive events with
probabilities

$$p_{i(0)i(1)\cdots i(T)} = p_{i(0)i(1)}\,p_{i(1)i(2)} \cdots p_{i(T-1)i(T)} \tag{2.1}$$

when the transition probabilities are stationary.
If they are not stationary, then

$$p_{i(0)i(1)\cdots i(T)} = p_{i(0)i(1)}(1)\,p_{i(1)i(2)}(2) \cdots p_{i(T-1)i(T)}(T).$$

Let $n_{ij}(t)$ be the number of individuals in state $i$ at time $(t-1)$ and
$j$ at time $t$, and let $n_{i(0)i(1)\cdots i(T)}$ be the number of individuals
whose sequence of states is $i(0), i(1), \dots, i(T)$. Then

$$n_{gj}(t) = \sum n_{i(0)i(1)\cdots i(T)} \tag{2.2}$$

where the sum is over all values of the $i$'s with $i(t-1) = g$ and $i(t) = j$.
The probability, in the $nm^T$-dimensional space describing all sequences for
all $n$ individuals (for each initial state there are $m^T$ dimensions), of a
given ordered set of sequences for the $n$ individuals is:


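$$
\begin{aligned}
\prod_{i(0),\dots,i(T)} \bigl[p_{i(0)i(1)\cdots i(T)}\bigr]^{n_{i(0)i(1)\cdots i(T)}}
&= \prod_{i(0),\dots,i(T)} \bigl[p_{i(0)i(1)}(1)\,p_{i(1)i(2)}(2) \cdots p_{i(T-1)i(T)}(T)\bigr]^{n_{i(0)i(1)\cdots i(T)}} \\
&= \Bigl(\prod_{i(0),\dots,i(T)} \bigl[p_{i(0)i(1)}(1)\bigr]^{n_{i(0)i(1)\cdots i(T)}}\Bigr) \cdots
   \Bigl(\prod_{i(0),\dots,i(T)} \bigl[p_{i(T-1)i(T)}(T)\bigr]^{n_{i(0)i(1)\cdots i(T)}}\Bigr) \\
&= \Bigl(\prod_{i(0),i(1)} \bigl[p_{i(0)i(1)}(1)\bigr]^{n_{i(0)i(1)}(1)}\Bigr) \cdots
   \Bigl(\prod_{i(T-1),i(T)} \bigl[p_{i(T-1)i(T)}(T)\bigr]^{n_{i(T-1)i(T)}(T)}\Bigr) \\
&= \prod_{t=1}^{T} \prod_{g,j} p_{gj}(t)^{\,n_{gj}(t)}.
\end{aligned}
\tag{2.3}
$$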

Therefore, according to the "factor theorem" (Hogg and Craig, 1965), the set
of numbers $n_{ij}(t)$ forms a set of sufficient statistics.


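Let

$$n_i(t-1) = \sum_{j=1}^{m} n_{ij}(t).$$

Then the conditional distribution of $n_{ij}(t)$, $j = 1,2,\dots,m$, given
$n_i(t-1)$ is

$$\frac{n_i(t-1)!}{\prod_{j=1}^{m} n_{ij}(t)!} \prod_{j=1}^{m} p_{ij}(t)^{\,n_{ij}(t)}.$$

The distribution of the $n_{ij}(t)$ (conditional on the $n_i(0)$) is

$$\prod_{t=1}^{T} \left\{ \prod_{i=1}^{m} \left[ \frac{n_i(t-1)!}{\prod_{j=1}^{m} n_{ij}(t)!} \prod_{j=1}^{m} p_{ij}(t)^{\,n_{ij}(t)} \right] \right\}.$$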

If the transition probabilities are stationary, then the set
$n_{ij} = \sum_{t=1}^{T} n_{ij}(t)$ can be seen to be a set of sufficient
statistics and (2.3) can be written in the form

$$\prod_{t=1}^{T} \prod_{g,j} p_{gj}^{\,n_{gj}(t)} = \prod_{i,j} p_{ij}^{\,n_{ij}}. \tag{2.4}$$


Maximum  Likelihood  Estimates 


The stationary transition probabilities $p_{ij}$ can be estimated by
maximizing the probability (2.4) with respect to the $p_{ij}$ under the
conditions

(1) $p_{ij} \ge 0$ ; $i,j = 1,2,\dots,m$,

(2) $\sum_{j=1}^{m} p_{ij} = 1$ ; for all $i$,

where the $n_{ij}$ are the actual observations.

The probability (2.4) has the same form as the likelihood for $m$ independent
samples in which the $i$th sample ($i = 1,2,\dots,m$) consists of
$n_i^* = \sum_j n_{ij}$ multinomial trials with probabilities $p_{ij}$
($i,j = 1,2,\dots,m$). Then the maximum likelihood estimates for $p_{ij}$ are

$$\hat p_{ij} = \frac{n_{ij}}{n_i^*}
= \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{k=1}^{m}\sum_{t=1}^{T} n_{ik}(t)}
= \frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=0}^{T-1} n_i(t)}.$$


When the transition probabilities are not necessarily stationary,
the maximum likelihood estimates for the $p_{ij}(t)$ are

$$\hat p_{ij}(t) = \frac{n_{ij}(t)}{n_i(t-1)} = \frac{n_{ij}(t)}{\sum_{k=1}^{m} n_{ik}(t)}.$$


Formally the estimates are the same as one would obtain if for each $i$
and $t$ one had $n_i(t-1)$ observations on a multinomial distribution with
probabilities $p_{ij}(t)$ and with resulting numbers $n_{ij}(t)$.

The estimates can be described in the following way: Let the entries
$n_{ij}(t)$ for given $t$ be entered in a two-way $m \times m$ table. The
estimate of $p_{ij}(t)$ is the $(i,j)$th entry in the table divided by the sum
of the entries in the $i$th row. To estimate $p_{ij}$ for a stationary chain,
add the corresponding entries in the two-way tables for $t = 1,2,\dots,T$ and
obtain a two-way table with entries $n_{ij} = \sum_t n_{ij}(t)$. The estimate
of $p_{ij}$ is the $(i,j)$th entry of the table of $n_{ij}$'s divided by the
sum of the entries in the $i$th row.
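The table description above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original report: the function name `mle_transition_estimates`, the layout `counts[t-1][i][j]` for $n_{ij}(t)$, and the two-state counts are all assumptions chosen for the example, and every row sum is assumed to be positive.

```python
def mle_transition_estimates(counts):
    """Estimate transition probabilities from transition-count tables.

    counts[t-1][i][j] holds n_ij(t), the number of individuals in state i
    at time t-1 and state j at time t (hypothetical data layout).
    """
    m = len(counts[0])
    # Time-dependent estimates: each entry divided by its row sum n_i(t-1).
    p_t = [[[n / sum(row) for n in row] for row in table] for table in counts]
    # Stationary estimates: add the T tables entry by entry, then divide
    # each pooled entry n_ij by the pooled row sum n_i*.
    pooled = [[sum(table[i][j] for table in counts) for j in range(m)]
              for i in range(m)]
    p = [[n / sum(row) for n in row] for row in pooled]
    return p_t, p, pooled


# Hypothetical counts for m = 2 states and T = 2 transitions (n = 60).
counts = [
    [[30, 10], [5, 15]],   # n_ij(1)
    [[28, 7], [8, 17]],    # n_ij(2); row sums equal column sums of n_ij(1)
]
p_t, p, pooled = mle_transition_estimates(counts)
```

Here `p_t[t-1][i][j]` plays the role of $\hat p_{ij}(t)$ and `p[i][j]` the role of $\hat p_{ij}$; each row of every estimated matrix sums to one by construction.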

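Asymptotic Behavior of $n_{ij}(t)$

Consider the following theorem:
Theorem: If $(x_{1\xi}, x_{2\xi}, \dots, x_{k\xi};\ \xi = 1,2,\dots,n)$ is a
sample of size $n$ from the multinomial distribution
$M(1; p_1, p_2, \dots, p_k)$, then the sample sums $(z_1, z_2, \dots, z_k)$
have, as their asymptotic distribution for large $n$, the distribution
$N(\{np_i\}, \|n(p_i\delta_{ij} - p_ip_j)\|)$, where $\delta_{ij}$ is the
Kronecker delta (Wilks, 1963, p. 259).

Proof: The p.f. of the multinomial distribution $M(1; p_1, p_2, \dots, p_k)$
is

$$p(x_1, \dots, x_k) = \frac{1!}{x_1!\,x_2! \cdots x_{k+1}!}\; p_1^{x_1} p_2^{x_2} \cdots p_{k+1}^{x_{k+1}}$$

where $x_{k+1} = 1 - x_1 - x_2 - \cdots - x_k$ and
$p_{k+1} = 1 - p_1 - p_2 - \cdots - p_k$.

Then the characteristic function of the multinomial distribution is

$$
\begin{aligned}
\varphi(t_1, \dots, t_k)
&= \sum e^{\,i t_1 x_1 + i t_2 x_2 + \cdots + i t_k x_k}\; p(x_1, \dots, x_k) \\
&= \sum \frac{1!}{x_1! \cdots x_{k+1}!}\,(p_1 e^{it_1})^{x_1} \cdots (p_k e^{it_k})^{x_k}\,(p_{k+1})^{x_{k+1}} \\
&= p_1 e^{it_1} + p_2 e^{it_2} + \cdots + p_k e^{it_k} + p_{k+1}.
\end{aligned}
$$

It follows that

$$E(x_i) = p_i, \qquad \operatorname{cov}(x_i, x_j) = \sigma_{ij} = p_i\delta_{ij} - p_ip_j.$$

Then, using the result (see Wilks, 1963, p. 258)

$$\lim_{n\to\infty} P\left(\frac{z_i - np_i}{\sqrt{n}} < y_i,\ i = 1,2,\dots,k\right)
= \frac{|\sigma^{ij}|^{1/2}}{(2\pi)^{k/2}} \int_{-\infty}^{y_1} \cdots \int_{-\infty}^{y_k}
\exp\Bigl(-\tfrac{1}{2}\sum_{i,j} \sigma^{ij} u_i u_j\Bigr)\, du_1 \cdots du_k,$$

where $\|\sigma^{ij}\| = \|\sigma_{ij}\|^{-1}$, it follows that

$$(z_1, \dots, z_k) \sim N\bigl(\{np_i\},\ \|n(p_i\delta_{ij} - p_ip_j)\|\bigr).$$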

To find the asymptotic behavior of the $\hat p_{ij}$, first consider the
$n_{ij}(t)$. For each $i(0)$, the set $n_{i(0)i(1)\cdots i(T)}$ are simply
multinomial variables with sample size $n_{i(0)}(0)$ and parameters
$p_{i(0)i(1)}\,p_{i(1)i(2)} \cdots p_{i(T-1)i(T)}$, and hence are
asymptotically normally distributed as the sample size increases. The
$n_{ij}(t)$ are linear combinations of these multinomial variables, and are
also asymptotically normally distributed.

Let $P = [p_{ij}]$ and let $p_{ij}^{[t]}$ be the elements of the matrix $P^t$.
Then $p_{ij}^{[t]}$ is the probability of state $j$ at time $t$ given state $i$
at time 0. Let $n_{k;ij}(t)$ be the number of sequences including state $k$ at
time 0, $i$ at time $(t-1)$ and $j$ at time $t$. Then

$$n_{ij}(t) = \sum_{k=1}^{m} n_{k;ij}(t).$$

The probability associated with $n_{k;ij}(t)$ is $p_{ki}^{[t-1]}\,p_{ij}$,
with a sample size of $n_k(0)$. Thus

$$E\,n_{k;ij}(t) = n_k(0)\,p_{ki}^{[t-1]}\,p_{ij},$$

$$\operatorname{var}\,n_{k;ij}(t) = n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}\,\bigl[1 - p_{ki}^{[t-1]}\,p_{ij}\bigr],$$

$$\operatorname{cov}\bigl(n_{k;ij}(t),\, n_{k;gh}(t)\bigr) = -n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}\,p_{kg}^{[t-1]}\,p_{gh}, \qquad (i,j) \ne (g,h).$$
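The mean formula above can be evaluated numerically with plain matrix powers. The sketch below is only an illustration under stated assumptions: the helper names (`mat_mul`, `mat_power`, `expected_counts`), the two-state matrix `P`, and the initial counts `n0` are hypothetical and do not come from the report.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_power(p, t):
    """Return P^t for a non-negative integer t (P^0 is the identity)."""
    m = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(t):
        result = mat_mul(result, p)
    return result


def expected_counts(p, n0, t):
    """e[k][i][j] = n_k(0) * p_ki^[t-1] * p_ij, the mean of n_{k;ij}(t)."""
    m = len(p)
    p_prev = mat_power(p, t - 1)
    return [[[n0[k] * p_prev[k][i] * p[i][j] for j in range(m)]
             for i in range(m)] for k in range(m)]


# Hypothetical two-state chain and initial counts, chosen only for illustration.
P = [[0.9, 0.1], [0.5, 0.5]]
n0 = [100, 50]
e = expected_counts(P, n0, t=2)
# Summing over the initial state k gives the mean of n_ij(t).
mean_n = [[sum(e[k][i][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
```

For each initial state $k$ the entries `e[k]` add up to $n_k(0)$, which is a quick consistency check on the multinomial interpretation.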


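Consider $n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}$, where
$n_{k;i}(t-1) = \sum_j n_{k;ij}(t)$. Then the conditional distribution of
$n_{k;ij}(t)$ given $n_{k;i}(t-1)$ is multinomial with the probabilities
$p_{ij}$. Thus,

$$E\{n_{k;ij}(t) \mid n_{k;i}(t-1)\} = n_{k;i}(t-1)\,p_{ij},$$

$$
\begin{aligned}
E\{n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}\}
&= E\,E\{[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}] \mid n_{k;i}(t-1)\} \\
&= 0 \quad \text{(see Wilks, 1963, p. 84)},
\end{aligned}
$$

$$
\begin{aligned}
\operatorname{var}[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}]
&= E[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}]^2 \\
&= E\,E\{[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}]^2 \mid n_{k;i}(t-1)\} \\
&= E\,n_{k;i}(t-1)\,p_{ij}(1 - p_{ij}) \\
&= n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}(1 - p_{ij}),
\end{aligned}
$$

$$
\begin{aligned}
&\operatorname{cov}\bigl(n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij},\ n_{k;ih}(t) - n_{k;i}(t-1)\,p_{ih}\bigr) \\
&\quad= E[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;ih}(t) - n_{k;i}(t-1)\,p_{ih}] \\
&\quad= E\,E\{[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;ih}(t) - n_{k;i}(t-1)\,p_{ih}] \mid n_{k;i}(t-1)\} \\
&\quad= E[-n_{k;i}(t-1)\,p_{ij}\,p_{ih}] \\
&\quad= -n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}\,p_{ih}, \qquad j \ne h,
\end{aligned}
$$

$$
\begin{aligned}
&E[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;gh}(t) - n_{k;g}(t-1)\,p_{gh}] \\
&\quad= E\,E\{[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;gh}(t) - n_{k;g}(t-1)\,p_{gh}] \mid n_{k;i}(t-1),\, n_{k;g}(t-1)\} \\
&\quad= 0, \qquad i \ne g,
\end{aligned}
$$

$$
\begin{aligned}
&E[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;gh}(t+r) - n_{k;g}(t+r-1)\,p_{gh}] \\
&\quad= E\,E\{[n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}][n_{k;gh}(t+r) - n_{k;g}(t+r-1)\,p_{gh}] \mid n_{k;ij}(t),\, n_{k;g}(t+r-1),\, n_{k;i}(t-1)\} \\
&\quad= 0, \qquad r > 0.
\end{aligned}
$$

Thus, the random variables $n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}$ for
$j = 1,2,\dots,m$ have zero means, and the variances and covariances of
multinomial variables with probabilities $p_{ij}$ and sample size
$n_k(0)\,p_{ki}^{[t-1]}$. The variables $n_{k;ij}(t) - n_{k;i}(t-1)\,p_{ij}$
and $n_{k;gh}(s) - n_{k;g}(s-1)\,p_{gh}$ are uncorrelated if $t \ne s$ or
$i \ne g$.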
Asymptotic  Distribution  of  the  Estimates 


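Consider

$$
\sqrt{n}\,(\hat p_{ij} - p_{ij})
= \sqrt{n}\left[\frac{\sum_{t=1}^{T} n_{ij}(t)}{\sum_{t=1}^{T} n_i(t-1)} - p_{ij}\right]
= \sqrt{n}\;\frac{\sum_{k=1}^{m}\sum_{t=1}^{T} \bigl[n_{k;ij}(t) - p_{ij}\,n_{k;i}(t-1)\bigr]}{\sum_{t=1}^{T} n_i(t-1)}.
$$

Since $n_{k;ij}(t)$ is a multinomial variable with probabilities
$p_{ki}^{[t-1]}\,p_{ij}$, then $n_{k;ij}(t)/n$ converges in probability to its
expected value when $n_k(0)/n \to \eta_k$. Thus

$$
\begin{aligned}
\operatorname*{p\,lim}_{n\to\infty} \frac{1}{n}\sum_{t=1}^{T} n_i(t-1)
&= \lim_{n\to\infty} \frac{1}{n}\,E\sum_{t=1}^{T} n_i(t-1) \\
&= \lim_{n\to\infty} \frac{1}{n}\,E\sum_{t=1}^{T}\sum_{k=1}^{m} n_{k;i}(t-1) \\
&= \lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{T}\sum_{k=1}^{m} E[n_{k;i}(t-1)] \\
&= \lim_{n\to\infty} \frac{1}{n}\sum_{t=1}^{T}\sum_{k=1}^{m} n_k(0)\,p_{ki}^{[t-1]} \\
&= \sum_{k=1}^{m} \lim_{n\to\infty}\frac{n_k(0)}{n}\,\sum_{t=1}^{T} p_{ki}^{[t-1]} \\
&= \sum_{k=1}^{m} \eta_k \sum_{t=1}^{T} p_{ki}^{[t-1]}.
\end{aligned}
$$

Then by a convergence theorem (see Cramér, 1946, p. 254),
$\sqrt{n}\,(\hat p_{ij} - p_{ij})$ has the same limit distribution as

$$\frac{\sum_{t=1}^{T} \bigl[n_{ij}(t) - p_{ij}\,n_i(t-1)\bigr]/n^{1/2}}{\sum_{k=1}^{m}\sum_{t=1}^{T} \eta_k\,p_{ki}^{[t-1]}}. \tag{5.1}$$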


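Assume the $n_k(0)$ are fixed. Then by the arguments of the previous section
we have

$$E\Bigl[\sum_{t=1}^{T} \bigl[n_{ij}(t) - p_{ij}\,n_i(t-1)\bigr]/n^{1/2}\Bigr] = 0,$$

$$E\Bigl[\sum_{t} \bigl(n_{ij}(t) - p_{ij}\,n_i(t-1)\bigr)\Bigr]^2 / n
= \sum_{k}\sum_{t} n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}(1 - p_{ij})/n,$$

$$E\Bigl[\sum_{t} \bigl(n_{ij}(t) - p_{ij}\,n_i(t-1)\bigr)\Bigr]\Bigl[\sum_{t} \bigl(n_{gh}(t) - p_{gh}\,n_g(t-1)\bigr)\Bigr] / n
= -\delta_{ig}\sum_{k}\sum_{t} n_k(0)\,p_{ki}^{[t-1]}\,p_{ij}\,p_{gh}/n, \qquad (i,j) \ne (g,h),$$

where $\delta_{ig}$ is the Kronecker delta.

Since $\sqrt{n}\,(\hat p_{ij} - p_{ij})$ has the same limiting distribution as
(5.1), and writing $\phi_i = \sum_{k}\sum_{t} \eta_k\,p_{ki}^{[t-1]}$ for the
denominator of (5.1), the variables

$\sqrt{n}\,(\hat p_{ij} - p_{ij}) \to$ mean 0, variance
$p_{ij}(1 - p_{ij})/\phi_i$, covariance $-\delta_{ig}\,p_{ij}\,p_{gh}/\phi_i$;

$\sqrt{n\phi_i}\,(\hat p_{ij} - p_{ij}) \to$ mean 0, variance
$p_{ij}(1 - p_{ij})$, covariance $-\delta_{ig}\,p_{ij}\,p_{gh}$;

$\sqrt{n_i^*}\,(\hat p_{ij} - p_{ij}) \to$ mean 0, variance
$p_{ij}(1 - p_{ij})$, covariance $-\delta_{ig}\,p_{ij}\,p_{gh}$,

where $n_i^* = \sum_{t=0}^{T-1} n_i(t)$.

(Note: $\to$ means "has a limiting joint normal distribution with".)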

The variables $(n_i^*)^{1/2}(\hat p_{ij} - p_{ij})$ for $m$ different values
of $i$ ($i = 1,2,\dots,m$) are asymptotically independent, and hence have the
same limiting joint distribution as obtained from similar functions of the
estimates of multinomial probabilities $p_{ij}$ from $m$ independent samples
with sample sizes $n_i^*$ ($i = 1,2,\dots,m$).

Likewise, the variables $\hat p_{ij}(t) = n_{ij}(t)/n_i(t-1)$ for a given $i$
and $t$ have the same asymptotic distribution as the estimates of multinomial
probabilities with sample sizes $E\,n_i(t-1)$, and the variables
$\hat p_{ij}(t)$ for two different values of $i$ or $t$ are asymptotically
independent.




TESTS  OF  HYPOTHESES 
Test  of  Hypotheses  about  Specific  Probabilities 

Let $p_{ij}^0$ ($i,j = 1,2,\dots,m$) be given values and consider the problem
of testing $H_0: p_{ij} = p_{ij}^0$, $j = 1,2,\dots,m$, for a given $i$ against
$H_a: p_{ij} \ne p_{ij}^0$ for at least one $j$. Under $H_0$,

$$\sum_{j=1}^{m} n_i^*\,\frac{(\hat p_{ij} - p_{ij}^0)^2}{p_{ij}^0} \tag{3.1.1}$$

is asymptotically $\chi^2_{m-1}$. Since the
$(n_i^*)^{1/2}(\hat p_{ij} - p_{ij}^0)$ for different $i$ are asymptotically
independent, the forms (3.1.1) for different $i$ are asymptotically
independent and hence can be added to obtain other $\chi^2$-variables. For
instance, a test for all $p_{ij}$ ($i,j = 1,2,\dots,m$) can be obtained by
adding (3.1.1) over all $i$, which results in a $\chi^2$-variable with
$m(m-1)$ d.f.
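A small sketch of the statistic (3.1.1) follows. It is offered as an illustration only: the function name `chi2_specified`, the pooled counts `n`, and the hypothesized matrix `p0` are assumptions made for this example, and every $p_{ij}^0$ and row total $n_i^*$ is assumed to be positive.

```python
def chi2_specified(n, p0):
    """Evaluate (3.1.1) for each row i, and the sum over all rows.

    n[i][j] is the pooled count n_ij; p0[i][j] is the hypothesized p_ij^0.
    """
    stats = []
    for row, row0 in zip(n, p0):
        n_star = sum(row)  # n_i*
        stat = sum(n_star * (n_ij / n_star - q) ** 2 / q
                   for n_ij, q in zip(row, row0))
        stats.append(stat)
    return stats, sum(stats)


# Hypothetical pooled counts and hypothesized transition matrix (m = 2).
n = [[58, 17], [13, 32]]
p0 = [[0.8, 0.2], [0.3, 0.7]]
row_stats, total = chi2_specified(n, p0)
m = len(n)
df_row, df_total = m - 1, m * (m - 1)
```

Each entry of `row_stats` would be referred to a $\chi^2$ table with $m - 1$ d.f., and `total` to one with $m(m-1)$ d.f.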


Testing  Ho  that  the  Transition  Probabilities  are  Constant 


To test $H_0: p_{ij}(t) = p_{ij}$ ($t = 1,2,\dots,T$), the estimates of the
transition probabilities for time $t$ are

$$\hat p_{ij}(t) = \frac{n_{ij}(t)}{n_i(t-1)}.$$

Then the likelihood function maximized under $H_0$ is

$$L_{\max,\omega} = \prod_{t}\prod_{i,j} \hat p_{ij}^{\,n_{ij}(t)}.$$

The likelihood function maximized under $H_a$ is

$$L_{\max,\Omega} = \prod_{t}\prod_{i,j} \hat p_{ij}(t)^{\,n_{ij}(t)}.$$

The familiar likelihood ratio criterion is

$$\lambda = \prod_{t}\prod_{i,j} \left[\frac{\hat p_{ij}}{\hat p_{ij}(t)}\right]^{n_{ij}(t)}$$

and $-2\log\lambda$ is distributed as $\chi^2$ with $(T-1)\,m(m-1)$ d.f. when
$H_0$ is true (Neyman, 1949).

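An $m \times T$ contingency table can be used to represent the joint estimates
$\hat p_{ij}(t)$ for a given $i$ and for $j = 1,2,\dots,m$ and
$t = 1,2,\dots,T$. Thus

$$
\begin{array}{c|cccc}
t \backslash j & 1 & 2 & \cdots & m \\ \hline
1 & \hat p_{i1}(1) & \hat p_{i2}(1) & \cdots & \hat p_{im}(1) \\
2 & \hat p_{i1}(2) & \hat p_{i2}(2) & \cdots & \hat p_{im}(2) \\
\vdots & \vdots & \vdots & & \vdots \\
T & \hat p_{i1}(T) & \hat p_{i2}(T) & \cdots & \hat p_{im}(T)
\end{array}
$$

and for each row there are $m$ constants $p_{i1}, p_{i2}, \dots, p_{im}$ with
$\sum_j p_{ij} = 1$.

The $\chi^2$-test of homogeneity is given by

$$\chi_i^2 = \sum_{t,j} n_i(t-1)\,\frac{[\hat p_{ij}(t) - \hat p_{ij}]^2}{\hat p_{ij}}$$

and $\chi_i^2$ has $(m-1)(T-1)$ d.f., where $i = 1,2,\dots,m$.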

Another test of the hypothesis that the transition probabilities are
constant for $T$ independent samples from multinomial trials can be obtained
by using the likelihood ratio criterion. Thus

$$\lambda_i = \prod_{t,j} \left[\frac{\hat p_{ij}}{\hat p_{ij}(t)}\right]^{n_{ij}(t)}. \tag{3.2.2}$$

The asymptotic distribution of $-2\log\lambda_i$ is $\chi^2$ with
$(m-1)(T-1)$ d.f., since it is related to the contingency table approach dealt
with for a given $i$. Hence, $H_0$ can be tested separately for each value of
$i$.

Consider the joint hypothesis that $p_{ij}(t) = p_{ij}$ for all
$i,j = 1,2,\dots,m$ and $t = 1,2,\dots,T$. A test of this joint $H_0$ can be
obtained from $\hat p_{ij}(t)$ and $\hat p_{ij}$ directly, since the
$\chi_i^2$'s are asymptotically independent. Hence

$$\chi^2_{m(m-1)(T-1)} = \sum_{i=1}^{m} \chi_i^2 = \sum_{i}\sum_{t,j} n_i(t-1)\,\frac{[\hat p_{ij}(t) - \hat p_{ij}]^2}{\hat p_{ij}} \tag{3.3.3}$$

and the test criterion based on (3.2.2) is

$$\sum_{i=1}^{m} -2\log\lambda_i = -2\log\lambda.$$
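The two stationarity criteria can be computed side by side, which also shows how close they tend to be. The following is a sketch under stated assumptions rather than a definitive implementation: the function name `stationarity_tests`, the layout `counts[t-1][i][j]` for $n_{ij}(t)$, and the example counts are hypothetical, and all row sums are assumed positive.

```python
import math


def stationarity_tests(counts):
    """Return per-row chi-square statistics, per-row -2 log(lambda_i), and d.f.

    counts[t-1][i][j] holds n_ij(t) (hypothetical data layout).
    """
    m = len(counts[0])
    T = len(counts)
    chi2 = [0.0] * m
    lr = [0.0] * m
    for i in range(m):
        pooled_row = [sum(table[i][j] for table in counts) for j in range(m)]
        n_star = sum(pooled_row)
        for table in counts:
            n_prev = sum(table[i])  # n_i(t-1)
            for j in range(m):
                p_hat = pooled_row[j] / n_star   # stationary estimate
                p_hat_t = table[i][j] / n_prev   # estimate at time t
                chi2[i] += n_prev * (p_hat_t - p_hat) ** 2 / p_hat
                if table[i][j] > 0:              # a zero count contributes 0
                    lr[i] += 2.0 * table[i][j] * math.log(p_hat_t / p_hat)
    return chi2, lr, (m - 1) * (T - 1)


# Hypothetical counts for m = 2 states and T = 2 transitions.
counts = [
    [[30, 10], [5, 15]],
    [[28, 7], [8, 17]],
]
chi2, lr, df_each = stationarity_tests(counts)
joint_chi2, joint_lr = sum(chi2), sum(lr)
joint_df = len(counts[0]) * df_each  # m(m-1)(T-1)
```

Summing the per-row values gives the joint statistics, each referred to $\chi^2$ with $m(m-1)(T-1)$ d.f.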


Test  of  the  Hypothesis  that  the  Chain  is  of  a  Given  Order 

A 1st-order stationary chain is a special 2nd-order chain, for which
$p_{ijk}(t)$ does not depend on $i$. Conversely, a 2nd-order chain can be
represented as a more complicated 1st-order chain. To do this, let the pair of
successive states $i$ and $j$ define a composite state $(i,j)$. Then, the
probability of the composite state $(j,k)$ at $t$ given the composite state
$(i,j)$ at $t-1$ is $p_{ijk}(t)$. The probability of state $(h,k)$, $h \ne j$,
given $(i,j)$, is zero. Assume $n_i(0)$ and $n_{ij}(1)$ are nonrandom.
Consider the set $n_{ijk}(t)$ ($i,j,k = 1,2,\dots,m$; $t = 2,3,\dots,T$). The
conditional distribution of $n_{ijk}(t)$, given $n_{ij}(t-1)$, is

$$\frac{n_{ij}(t-1)!}{\prod_{k} n_{ijk}(t)!} \prod_{k=1}^{m} p_{ijk}^{\,n_{ijk}(t)}$$

where $n_{ij}(t-1) = \sum_k n_{ijk}(t)$.

The joint distribution of $n_{ijk}(t)$ for $i,j,k = 1,2,\dots,m$ and
$t = 2,3,\dots,T$ is

$$\prod_{t=2}^{T}\prod_{i,j} \left[\frac{n_{ij}(t-1)!}{\prod_{k} n_{ijk}(t)!} \prod_{k=1}^{m} p_{ijk}^{\,n_{ijk}(t)}\right].$$

The maximum likelihood estimate of $p_{ijk}$ for stationary chains is

$$\hat p_{ijk} = \frac{n_{ijk}}{\sum_{l=1}^{m} n_{ijl}} = \frac{\sum_{t=2}^{T} n_{ijk}(t)}{\sum_{t=2}^{T} n_{ij}(t-1)}.$$

Consider $H_0: p_{1jk} = p_{2jk} = \cdots = p_{mjk} = p_{jk}$, for all
$j,k = 1,2,\dots,m$.

The likelihood criterion for testing this hypothesis is

$$\lambda = \prod_{i,j,k} \left(\frac{\hat p_{jk}}{\hat p_{ijk}}\right)^{n_{ijk}} \tag{3.3.1}$$

where

$$\hat p_{jk} = \frac{\sum_i n_{ijk}}{\sum_i\sum_l n_{ijl}} = \frac{\sum_{t=2}^{T} n_{jk}(t)}{\sum_{t=1}^{T-1} n_j(t)}.$$

Under $H_0$, $-2\log\lambda$ is asymptotically $\chi^2$ with $m(m-1)^2$ d.f.

In contingency tables, for a given $j$, the
$n^{1/2}(\hat p_{ijk} - p_{ijk})$ have the same asymptotic distribution as the
estimates of multinomial probabilities for $m$ independent samples
($i = 1,2,\dots,m$). An $m \times m$ table can be used to represent
$\hat p_{ijk}$ for a given $j$ and for $i,k = 1,2,\dots,m$. To test
$H_0: p_{ijk} = p_{jk}$ for $i = 1,2,\dots,m$, we have

$$\chi_j^2 = \sum_{i,k} n_{ij}^*\,\frac{(\hat p_{ijk} - \hat p_{jk})^2}{\hat p_{jk}}$$

where

$$n_{ij}^* = \sum_k n_{ijk} = \sum_k\sum_{t=2}^{T} n_{ijk}(t) = \sum_{t=2}^{T} n_{ij}(t-1) = \sum_{t=1}^{T-1} n_{ij}(t).$$

If $H_0$ is true, $\chi_j^2$ has the usual limiting distribution with
$(m-1)^2$ d.f.

For the use of the likelihood ratio criterion to test $H_0$, we calculate

$$\lambda_j = \prod_{i,k} \left(\frac{\hat p_{jk}}{\hat p_{ijk}}\right)^{n_{ijk}}$$

and $-2\log\lambda_j$ is $\chi^2$ with $(m-1)^2$ d.f.

Consider the joint hypothesis that $p_{ijk} = p_{jk}$ for all
$i,j,k = 1,2,\dots,m$. To test this joint hypothesis, compute the sum

$$\chi^2_{m(m-1)^2} = \sum_{j=1}^{m} \chi_j^2 = \sum_{i,j,k} n_{ij}^*\,\frac{(\hat p_{ijk} - \hat p_{jk})^2}{\hat p_{jk}}.$$

The test criterion based on (3.3.1) can be written

$$\sum_{j=1}^{m} -2\log\lambda_j = -2\log\lambda = 2\sum_{i,j,k} n_{ijk}\,\bigl[\log\hat p_{ijk} - \log\hat p_{jk}\bigr].$$

Consider $H_0: p_{ij\cdots kl} = p_{j\cdots kl}$ for $i = 1,2,\dots,m$; that
is, test the hypothesis that a chain is of order $r-1$ against the alternative
that it is of order $r$. For this $H_0$ let $n_{ij\cdots kl}(t)$ be the number
of individuals in the states $i, j, \dots, k, l$ at times
$t-r, t-r+1, \dots, t-1, t$ respectively, and
$n_{ij\cdots k}(t-1) = \sum_l n_{ij\cdots kl}(t)$. Assume here that the
$n_{ij\cdots k}(r-1)$ are nonrandom. The maximum likelihood estimate of
$p_{ij\cdots kl}$ is

$$\hat p_{ij\cdots kl} = \frac{n_{ij\cdots kl}}{n^*_{ij\cdots k}}$$

where

$$n_{ij\cdots kl} = \sum_{t=r}^{T} n_{ij\cdots kl}(t)$$

and

$$n^*_{ij\cdots k} = \sum_l n_{ij\cdots kl} = \sum_{t=r}^{T} n_{ij\cdots k}(t-1) = \sum_{t=r-1}^{T-1} n_{ij\cdots k}(t).$$

For a given set $j, \dots, k$, the set $\hat p_{ij\cdots kl}$ will have the
same asymptotic distribution as estimates of multinomial probabilities for $m$
independent samples ($i = 1,2,\dots,m$), and can be represented by an
$m \times m$ table. Thus to test $H_0: p_{ij\cdots kl} = p_{j\cdots kl}$ (for
$i = 1,2,\dots,m$), we have

$$\chi^2_{j\cdots k} = \sum_{i,l} n^*_{ij\cdots k}\,\frac{(\hat p_{ij\cdots kl} - \hat p_{j\cdots kl})^2}{\hat p_{j\cdots kl}}$$

where

$$\hat p_{j\cdots kl} = \frac{\sum_i n_{ij\cdots kl}}{\sum_i n^*_{ij\cdots k}} = \frac{\sum_{t=r}^{T} n_{j\cdots kl}(t)}{\sum_{t=r-1}^{T-1} n_{j\cdots k}(t)}.$$

The $\chi^2_{j\cdots k}$ has $(m-1)^2$ d.f. Since there are $m^{r-1}$ sets
$j, \dots, k$ ($j = 1,\dots,m$; $\dots$; $k = 1,2,\dots,m$), then

$$\chi^2_{\text{total}} = \sum_{j,\dots,k} \chi^2_{j\cdots k}$$

will have $m^{r-1}(m-1)^2$ d.f. under the joint null hypothesis. One could use
the likelihood ratio criterion

$$\lambda_{j\cdots k} = \prod_{i,l} \left(\frac{\hat p_{j\cdots kl}}{\hat p_{ij\cdots kl}}\right)^{n_{ij\cdots kl}}$$

where $-2\log\lambda_{j\cdots k}$ is distributed asymptotically as $\chi^2$
with $(m-1)^2$ d.f., as a basis for testing $H_0$. Also

$$\sum_{j,\dots,k} \bigl(-2\log\lambda_{j\cdots k}\bigr) = 2\sum_{i,j,\dots,k,l} n_{ij\cdots kl}\,\log\bigl(\hat p_{ij\cdots kl}/\hat p_{j\cdots kl}\bigr)$$

has a limiting $\chi^2$-distribution with $m^{r-1}(m-1)^2$ d.f. when the joint
$H_0$ is true.
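The counting scheme for the order test can be sketched as follows. This is an illustrative sketch under stated assumptions, not the report's own procedure: the function name `order_test` is hypothetical, states are assumed to be coded as integers `0..m-1`, the counts are pooled over all supplied sequences and all times $t = r, \dots, T$, and terms with a zero pooled estimate $\hat p_{j\cdots kl}$ are skipped.

```python
import math
from collections import Counter


def order_test(sequences, r, m):
    """Test order r-1 (H0) against order r from pooled (r+1)-tuple counts."""
    n_full = Counter()  # n_{ij...kl}: counts of (r+1)-tuples of states
    for seq in sequences:
        for t in range(r, len(seq)):
            n_full[tuple(seq[t - r:t + 1])] += 1
    n_star = Counter()       # n*_{ij...k}: summed over the last state l
    n_tail = Counter()       # sum over i of n_{ij...kl}
    n_tail_star = Counter()  # sum over i of n*_{ij...k}
    for key, count in n_full.items():
        n_star[key[:-1]] += count
        n_tail[key[1:]] += count
        n_tail_star[key[1:-1]] += count
    chi2 = 0.0
    lr = 0.0
    for prefix, ns in n_star.items():  # prefix = (i, j, ..., k)
        for l in range(m):
            p_tail = n_tail[prefix[1:] + (l,)] / n_tail_star[prefix[1:]]
            if p_tail == 0.0:
                continue  # no observed transitions into l: term omitted
            n = n_full[prefix + (l,)]
            p_full = n / ns
            chi2 += ns * (p_full - p_tail) ** 2 / p_tail
            if n > 0:
                lr += 2.0 * n * math.log(p_full / p_tail)
    df = m ** (r - 1) * (m - 1) ** 2
    return chi2, lr, df


# Hypothetical strictly alternating two-state sequence of length 21.
seq = [t % 2 for t in range(21)]
chi2_r1, lr_r1, df_r1 = order_test([seq], r=1, m=2)  # order 0 vs order 1
chi2_r2, lr_r2, df_r2 = order_test([seq], r=2, m=2)  # order 1 vs order 2
```

For the alternating sequence the first test signals strong dependence on the previous state, while the second finds no additional dependence on the state two steps back, as one would expect.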

$\chi^2$-tests and Likelihood Criterion

The following development for testing certain hypotheses about single
chains is due to Bartlett (1951). Consider the observed sequence
$x_1, x_2, \dots, x_{n-1}, x_n$. The probability of this sequence $S$ is

$$p\{S\} = p\{x_1\}\,p\{x_2 \mid x_1\}\,p\{x_3 \mid x_1, x_2\} \cdots p\{x_k \mid x_1, x_2, \dots, x_{k-1}\} \prod_{h=1}^{n-k} p\{x_{k+h} \mid x_h, \dots, x_{k+h-1}\}. \tag{4.1}$$

The variable $x$ can take $s$ values as the states $1,2,\dots,s$ and hence a
subsequence $x_h, x_{h+1}, \dots, x_{k+h-1}, x_{k+h}$ can take $s^{k+1}$
values. Let the frequency of a subsequence of length $k+1$ be
$n_{ij\cdots qr}$. Let $n_{ij\cdots qr} = n_{ur}$ and
$p_{ur} = p\{x_r \mid x_i, \dots, x_q\}$. Then (4.1) may be written

$$L = \log p\{S\} = \sum_{j=1}^{k} \log p\{x_j \mid x_1, x_2, \dots, x_{j-1}\} + \sum_{u,r} n_{ur}\log p_{ur}. \tag{4.2}$$

If $n$ increases, then $\sum_{u,r} n_{ur}\log p_{ur}$ will become the dominant
part of $\log p\{S\}$. If $n$ is large enough then, under the condition
$\sum_r p_{ur} = 1$, it follows that

$$\hat p_{ur} = \frac{n_{ij\cdots qr}}{n_{ij\cdots q}} = \frac{n_{ur}}{n_u} \tag{4.3}$$

where $n_u = \sum_r n_{ur}$. Hence the likelihood criterion becomes

$$
\begin{aligned}
\lambda = -2[L - L_{\max}]
&= -2\Bigl[\sum_{u,r} n_{ur}\log p_{ur} - \sum_{u,r} n_{ur}\log\hat p_{ur}\Bigr] \\
&= -2\sum_{u,r} n_{ur}\log\bigl(p_{ur}\,n_u/n_{ur}\bigr) \\
&= -2\sum_{u,r} n_{ur}\log\Bigl(\frac{m_{ur}}{n_{ur}}\cdot\frac{n_u}{m_u}\Bigr) \\
&= 2\Bigl\{\sum_{u,r} n_{ur}\log\frac{n_{ur}}{m_{ur}} - \sum_{u} n_u\log\frac{n_u}{m_u}\Bigr\}
\end{aligned}
\tag{4.4}
$$

where $m_{ur} = nP_{ur} = np_u\,p_{ur}$, $m_u = np_u$, and $P_{ur}$, $p_u$
denote absolute probabilities of the 'values' $(u,r)$ and $u$.

In the case of $k = 0$, we have independence and (4.2) becomes

$$L = \sum_r n_r\log p_r$$

and (4.4) becomes

$$\lambda = 2\sum_r n_r\log(n_r/m_r). \tag{4.5}$$

The expression (4.5) is the likelihood criterion and asymptotically has a
$\chi^2$ distribution with $(s-1)$ d.f. Alternatively, one could use

r    "y 

to test $H_0: p_{ij\cdots kl} = p_{j\cdots kl}$ ($i = 1,2,\dots,s$).

Hoel (1954) has given another approach.
Let

$$L = \prod_{i,\dots,l} p_{ij\cdots kl}^{\,n_{ij\cdots kl}}$$

where $i, j, \dots, k, l = 1,2,\dots,m$. Corresponding to $m$ possible states,
let $n_{ij\cdots kl}$ denote the frequency of the $r$-order chain state
$ij\cdots kl$ for $r+1$ subscripts. Then, the maximum-likelihood estimate of
$p_{ij\cdots kl}$ is

$$\hat p_{ij\cdots kl} = \frac{n_{ij\cdots kl}}{n_{ij\cdots k}}$$

where $n_{ij\cdots k} = \sum_l n_{ij\cdots kl}$.

Let

$$\hat p'_{ij\cdots kl} = \hat p_{j\cdots kl} = \frac{n_{j\cdots kl}}{n_{j\cdots k}}$$

under $H_0$.

The likelihood ratio for testing $H_0$ is

$$\lambda = \frac{L_\omega(\max)}{L_\Omega(\max)} = \frac{L(\hat p'_{ij\cdots kl})}{L(\hat p_{ij\cdots kl})}.$$

Assume the $n_{ij\cdots kl}$ are asymptotically normally distributed. Hence

$$L \sim |A|^{1/2}\, e^{-\frac{1}{2}(n-u)A(n-u)'}$$

where $(n-u)$ denotes the row vector of the linearly independent variables
$n_{ij\cdots kl} - u_{ij\cdots kl}$, where $E(n_{ij\cdots kl}) = u_{ij\cdots kl}$
and $A$ is a positive definite matrix. Then

$$\lambda \sim \frac{|\hat A'|^{1/2}\, e^{-\frac{1}{2}(n-\hat u')\hat A'(n-\hat u')'}}{|\hat A|^{1/2}\, e^{-\frac{1}{2}(n-\hat u)\hat A(n-\hat u)'}}$$

where $\hat u$, $\hat A$ and $\hat u'$, $\hat A'$ indicate that the parameters
$p_{ij\cdots kl}$ have been replaced by $\hat p_{ij\cdots kl}$ and
$\hat p'_{ij\cdots kl}$, respectively. Now $-2\log\lambda$ is approximately

$$\log\frac{|\hat A|}{|\hat A'|} + [n-\hat u']\,\hat A'\,[n-\hat u']' - [n-\hat u]\,\hat A\,[n-\hat u]'$$

when $H_0$ is true. Then $|\hat A|$ and $|\hat A'|$ converge stochastically to
the same value. Hence

$$-2\log\lambda \sim [n-\hat u']\,\hat A'\,[n-\hat u']' - [n-\hat u]\,\hat A\,[n-\hat u]'.$$

Since

$$u_{ij\cdots kl} = E(n_{ij\cdots kl}) = nP_{ij\cdots k}\,p_{ij\cdots kl}$$

where $P_{ij\cdots k}$ denotes the absolute probability of obtaining the
$(r-1)$ chain state $ij\cdots k$, then

$$\hat u_{ij\cdots kl} = n\hat P_{ij\cdots k}\,\hat p_{ij\cdots kl}.$$

Since $P_{ij\cdots k}$ is some function of the transition probabilities
$p_{ij\cdots kl}$, it can be written as

$$P_{ij\cdots k} = g(p_{ij\cdots kl}) \quad\text{and}\quad \hat P_{ij\cdots k} = g(\hat p_{ij\cdots kl}).$$

Assume that $g(\hat p_{ij\cdots kl}) = n_{ij\cdots k}/n$; then

$$\hat u_{ij\cdots kl} = n\cdot\frac{n_{ij\cdots k}}{n}\cdot\frac{n_{ij\cdots kl}}{n_{ij\cdots k}} = n_{ij\cdots kl}.$$

Therefore

$$-2\log\lambda \sim [n-\hat u']\,\hat A'\,[n-\hat u']'. \tag{4.6}$$

The right hand side of (4.6) is a quadratic form and is distributed as
$\chi^2$ with $s^{r-1}(s-1)^2$ d.f. (here $s = m$, the number of states), and
hence $-2\log\lambda \sim \chi^2_{s^{r-1}(s-1)^2}$.


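Anderson and Goodman (1957) have the following approach to $\chi^2$-tests and
the likelihood ratio criterion. Consider the distribution of the
$\chi^2$-statistic (3.3.3) under $H_0: p_{ij}(t) = p_{ij}$ for all
$i,j = 1,2,\dots,m$ and $t = 1,2,\dots,T$. Since the
$\sqrt{n}\,(\hat p_{ij}(t) - p_{ij})$ are asymptotically normally distributed
with mean zero and variance $p_{ij}(1 - p_{ij})/m_i(t-1)$, etc., where

$$E\left[\frac{n_i(t)}{n}\right] = m_i(t),$$

then for different $t$ or different $i$ they are asymptotically independent.
Then $[n\,m_i(t-1)]^{1/2}\,[\hat p_{ij}(t) - p_{ij}] \to N(0,\ p_{ij}(1 - p_{ij}))$,
etc.

Let $\hat p'_{ij} = \sum_t m_i(t-1)\,\hat p_{ij}(t) \big/ \sum_t m_i(t-1)$. Then

$$\sum_{t,j} n\,m_i(t-1)\,\frac{[\hat p_{ij}(t) - \hat p'_{ij}]^2}{\hat p'_{ij}} \sim \chi^2 \quad\text{under } H_0.$$

But

$$
\begin{aligned}
\operatorname{p\,lim} \hat p'_{ij}
&= \operatorname{p\,lim} \sum_t m_i(t-1)\,\hat p_{ij}(t) \Big/ \sum_t m_i(t-1) \\
&= \operatorname{p\,lim} \sum_t n_i(t-1)\,\hat p_{ij}(t) \Big/ \sum_t n_i(t-1) \\
&= \operatorname{p\,lim} \sum_t n_{ij}(t) \Big/ \sum_t n_i(t-1) \\
&= p_{ij}
\end{aligned}
$$

and

$$\operatorname{p\,lim}\left(\frac{n_i(t)}{n} - m_i(t)\right) = 0.$$

Therefore

$$\operatorname{p\,lim}\left\{ n\sum_{t,j} \frac{m_i(t-1)\,(\hat p_{ij}(t) - \hat p'_{ij})^2}{\hat p'_{ij}} - \sum_{t,j} \frac{n_i(t-1)\,(\hat p_{ij}(t) - \hat p_{ij})^2}{\hat p_{ij}} \right\} = 0.$$

Hence, the $\chi_i^2$-statistic has the same asymptotic distribution as
$\sum_{t,j} n\,m_i(t-1)\,[\hat p_{ij}(t) - \hat p'_{ij}]^2/\hat p'_{ij}$; that
is, a $\chi^2$-distribution.

Next, consider that for $|x| < 1/2$,

$$
\begin{aligned}
(1+x)\log(1+x) &= (1+x)\Bigl(x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots\Bigr) \\
&= x + \frac{x^2}{2} - \frac{x^3}{6}\Bigl(1 - \frac{x}{2} + \cdots\Bigr)
\end{aligned}
$$

and

$$\Bigl|(1+x)\log(1+x) - x - \frac{x^2}{2}\Bigr| = \Bigl|\frac{x^3}{6}\Bigl(1 - \frac{x}{2} + \cdots\Bigr)\Bigr| \le |x|^3.$$

Since $\lambda_i = \prod_{t,j} [\hat p_{ij}/\hat p_{ij}(t)]^{n_{ij}(t)}$,

$$
\begin{aligned}
-2\log\lambda_i &= -2\sum_{t,j} n_{ij}(t)\log\bigl[\hat p_{ij}/\hat p_{ij}(t)\bigr] \\
&= 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}(t)\log\bigl[\hat p_{ij}(t)/\hat p_{ij}\bigr] \\
&= 2\sum_{t,j} n_i(t-1)\,\hat p_{ij}\,[1 + x_{ij}(t)]\log[1 + x_{ij}(t)]
\end{aligned}
$$

where $x_{ij}(t) = [\hat p_{ij}(t) - \hat p_{ij}]/\hat p_{ij}$.

Then

$$
\begin{aligned}
\Delta = -2\log\lambda_i - \chi_i^2
&= 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}\Bigl\{[1 + x_{ij}(t)]\log[1 + x_{ij}(t)] - \frac{[x_{ij}(t)]^2}{2}\Bigr\} \\
&= 2\sum_{j,t} n_i(t-1)\,\hat p_{ij}\Bigl\{[1 + x_{ij}(t)]\log[1 + x_{ij}(t)] - x_{ij}(t) - \frac{[x_{ij}(t)]^2}{2}\Bigr\}
\end{aligned}
$$

since $\sum_j \hat p_{ij}\,x_{ij}(t) = \sum_j (\hat p_{ij}(t) - \hat p_{ij}) = 0$.

For any $\varepsilon > 0$, we have

$$
\begin{aligned}
\Pr(|\Delta| < \varepsilon)
&\ge \Pr\bigl(|\Delta| < \varepsilon \text{ and } |x_{ij}(t)| < \tfrac{1}{2}\bigr) \\
&\ge \Pr\Bigl(2\sum_{j,t} n_i(t-1)\,\hat p_{ij}\,|x_{ij}(t)|^3 < \varepsilon \text{ and } |x_{ij}(t)| < \tfrac{1}{2}\Bigr) \\
&\ge \Pr\Bigl(2n\sum_{j,t} |x_{ij}(t)|^3 < \varepsilon \text{ and } |x_{ij}(t)| < \tfrac{1}{2}\Bigr).
\end{aligned}
$$

Now $\operatorname{p\,lim} x_{ij}(t) = \operatorname{p\,lim}\,[\hat p_{ij}(t) - \hat p_{ij}]/\hat p_{ij} = 0$.
Therefore

$$
\begin{aligned}
\operatorname{p\,lim} n[x_{ij}(t)]^3
&= \operatorname{p\,lim}\bigl[(x_{ij}(t))^2\,n \cdot x_{ij}(t)\bigr] \\
&= \operatorname{p\,lim}\left\{\left[\frac{\sqrt{n}\,(\hat p_{ij}(t) - p_{ij})}{\hat p_{ij}} - \frac{\sqrt{n}\,(\hat p_{ij} - p_{ij})}{\hat p_{ij}}\right]^2 x_{ij}(t)\right\} = 0.
\end{aligned}
$$

Hence $\Pr(|\Delta| \ge \varepsilon) \to 0$, and $-2\log\lambda_i$ has the same
limiting $\chi^2$-distribution as $\chi_i^2$.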


Application  and  Example 

To illustrate the usefulness of the theoretical results discussed in the
previous sections, we consider an example from climatology (Feyerherm and
Bark, 1964). Consider the problem of testing hypotheses concerning the order
of a Markov chain composed of a sequence of wet and dry days. We assume that
$p_{ij}(t) = p_{ij}$, $t = 1,2,\dots,T$, if $T$ is less than 41 days and that
successive years can be considered as repeated observations on the same chain.

The test statistic is easy to compute from ordinary contingency tables
which show observed numbers for various cells of the table. Data for
Manhattan, Kansas for the 40-day period beginning on the 7th day and ending
with the 46th day of the year were as shown in Tables 1-4, where the states
are taken to be D (dry day) and W (wet day).


Table 1. Observed values for testing

$H_0: p_{jk} = p_k$ vs $H_a: p_{jk} \ne p_k$, $j, k = D, W$

             k = D    k = W    Total
j = D         1799      262     2061
j = W          261      118      379
Total         2060      380     2440

$\chi^2 = 82.631$
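As an illustration of how the statistic is obtained from the contingency table, the following Python sketch recomputes the chi-square value for Table 1 together with the estimated first-order transition probabilities. The helper name `pearson_chi2` and the variable names are assumptions of this sketch, not part of the original computation.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Table 1: rows are j = D, W; columns are k = D, W.
table1 = [[1799, 262],
          [261, 118]]

# Estimated transition probabilities p_jk = n_jk / n_j (row proportions).
p_hat = [[count / sum(row) for count in row] for row in table1]

chi2_table1 = pearson_chi2(table1)
print(f"p_hat(D -> W) = {p_hat[0][1]:.4f}, p_hat(W -> W) = {p_hat[1][1]:.4f}")
print(f"chi-square = {chi2_table1:.3f}")
```

The estimated probability of a wet day is considerably larger after a wet day than after a dry day, which is what the large chi-square value reflects.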


Table 2. Observed values for testing

$H_0: p_{ijk} = p_{jk}$ vs $H_a: p_{ijk} \ne p_{jk}$, $i, j, k = D, W$

Day t-1: j = D
                      Day t
Day t-2      k = D    k = W    Total
i = D         1579      229     1808
i = W          220       33      253
Total         1799      262     2061

Day t-1: j = W
                      Day t
Day t-2      k = D    k = W    Total
i = D          178       83      261
i = W           83       35      118
Total          261      118      379

$\chi^2_{D,1} = 0.0285$

$\chi^2_{W,1} = 0.1735$

$\chi^2_2 = \chi^2_{D,1} + \chi^2_{W,1} = 0.2020$


From Table 1, the hypothesis of independence (chain is of zero order)
is firmly rejected. For testing the hypothesis that the chain is of order one
against the alternative that it is of order two (Table 2), three $\chi^2$ values
were computed. The first is for sequences in which the middle day (the
(t-1)st day) was dry, the second is for sequences in which the middle day was
wet, and the third is the sum of the first two. They provide asymptotically
independent tests of the same $H_0$, which could be stated alternatively as:
Given the weather (D or W) on the middle day of a three-day sequence, the
weather (D or W) on the 3rd day is independent of the weather (D or W) on
the 1st day. Evidence for rejecting $H_0$ is insufficient.
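The three values for Table 2 can be reproduced with the short Python sketch below. It also attaches upper-tail probabilities, using the closed forms of the chi-square tail for 1 and 2 degrees of freedom. The function and variable names are assumptions of this sketch, and the tail probabilities are an added illustration rather than figures quoted in the report.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )

def chi2_tail_1df(x):
    """P(chi-square with 1 degree of freedom > x)."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_tail_2df(x):
    """P(chi-square with 2 degrees of freedom > x)."""
    return math.exp(-x / 2.0)

# Table 2, keyed by the state j on day t-1; rows i = D, W; columns k = D, W.
table2 = {
    "D": [[1579, 229], [220, 33]],
    "W": [[178, 83], [83, 35]],
}

parts = {j: pearson_chi2(t) for j, t in table2.items()}
total = sum(parts.values())
p_total = chi2_tail_2df(total)

for j, value in parts.items():
    print(f"chi-square (j = {j}, 1 d.f.) = {value:.4f}, tail = {chi2_tail_1df(value):.3f}")
print(f"sum (2 d.f.) = {total:.4f}, tail = {p_total:.3f}")
```

All three tail probabilities are far above conventional significance levels, in line with the conclusion that the evidence against a first-order chain is insufficient.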

Before stating that we are dealing with a first-order chain, we might
look at some other hypotheses.


Table 3. Observed values for testing

$H_0: p_{ijkl} = p_{jkl}$ vs $H_a: p_{ijkl} \ne p_{jkl}$, $i, j, k, l = D, W$

Days t-2, t-1: j,k = D,D
                      Day t
Day t-3      l = D    l = W    Total
i = D         1389      190     1579
i = W          200       29      229
Total         1589      219     1808

Days t-2, t-1: j,k = D,W
                      Day t
Day t-3      l = D    l = W    Total
i = D          158       20      178
i = W           71       12       83
Total          229       32      261

$\chi^2_{DD,1} = 0.075$

$\chi^2_{DW,1} = 0.546$

Days t-2, t-1: j,k = W,D
                      Day t
Day t-3      l = D    l = W    Total
i = D          155       65      220
i = W           18       15       33
Total          173       80      253

Days t-2, t-1: j,k = W,W
                      Day t
Day t-3      l = D    l = W    Total
i = D           59       24       83
i = W           26        9       35
Total           85       33      118

$\chi^2_{WD,1} = 3.359$

$\chi^2_{WW,1} = 0.125$

$\chi^2_4 = \chi^2_{DD,1} + \chi^2_{DW,1} + \chi^2_{WD,1} + \chi^2_{WW,1} = 4.105$


Alternatively, the hypothesis in Table 3 can be stated: Given the weather
(DD, DW, WD, or WW) on the middle two days of a 4-day sequence, the weather
(D or W) on the 4th day is independent of the weather (D or W) on the 1st
day. Again, there is no basis for rejecting $H_0$.
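The four component statistics of Table 3 and their sum can be recomputed as in the following Python sketch. The upper-tail probability for the sum uses the closed form of the chi-square tail for an even number of degrees of freedom. The names `pearson_chi2` and `chi2_tail_even_df` are assumptions of this sketch, and the tail probability is an added illustration.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )

def chi2_tail_even_df(x, df):
    """P(chi-square with an even number df of degrees of freedom > x)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Table 3, keyed by the states (j, k) on days t-2 and t-1;
# rows i = D, W (day t-3); columns l = D, W (day t).
table3 = {
    "DD": [[1389, 190], [200, 29]],
    "DW": [[158, 20], [71, 12]],
    "WD": [[155, 65], [18, 15]],
    "WW": [[59, 24], [26, 9]],
}

parts = {jk: pearson_chi2(t) for jk, t in table3.items()}
total = sum(parts.values())
p_total = chi2_tail_even_df(total, df=4)

for jk, value in parts.items():
    print(f"chi-square (j,k = {jk}) = {value:.3f}")
print(f"sum (4 d.f.) = {total:.3f}, tail = {p_total:.3f}")
```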


Table 4. Observed values for testing

$H_0: p_{ijkl} = p_{kl}$ vs $H_a: p_{ijkl} \ne p_{kl}$, $i, j, k, l = D, W$

Day t-1: k = D
                           Day t
Days t-3, t-2     l = D    l = W    Total
i,j = DD           1389      200     1589
i,j = DW            155       18      173
i,j = WD            190       29      219
i,j = WW             65       15       80
Total              1799      262     2061

Day t-1: k = W
                           Day t
Days t-3, t-2     l = D    l = W    Total
i,j = DD            158       71      229
i,j = DW             59       26       85
i,j = WD             20       12       32
i,j = WW             24        9       33
Total               261      118      379

$\chi^2_{D,3} = 3.536$

$\chi^2_{W,3} = 0.848$

$\chi^2_6 = \chi^2_{D,3} + \chi^2_{W,3} = 4.384$
Alternatively, the hypothesis in Table 4 can be stated: Given the
weather (D or W) on the 3rd day of a 4-day sequence, the weather (D or W)
on the 4th day is independent of the weather (D or W) on the first two days.
Again, there is no basis for rejecting $H_0$. The results in Tables 1-4
indicate that a first-order chain represents a good approximation for
describing the dependence in a sequence of wet and dry days.
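The statistics for Table 4 come from two 4 x 2 tables, each with 3 degrees of freedom. The Python sketch below recomputes them from the cell counts as tabulated and attaches an upper-tail probability to the sum with 6 degrees of freedom. The function names are assumptions of this sketch, and the tail probability is an added illustration.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )

def chi2_tail_even_df(x, df):
    """P(chi-square with an even number df of degrees of freedom > x)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Table 4, keyed by the state k on day t-1; one row per (i, j) pair in the
# order DD, DW, WD, WW; two columns per row, exactly as tabulated.
table4 = {
    "D": [[1389, 200], [155, 18], [190, 29], [65, 15]],
    "W": [[158, 71], [59, 26], [20, 12], [24, 9]],
}

parts = {k: pearson_chi2(t) for k, t in table4.items()}
total = sum(parts.values())
p_total = chi2_tail_even_df(total, df=6)

for k, value in parts.items():
    print(f"chi-square (k = {k}, 3 d.f.) = {value:.3f}")
print(f"sum (6 d.f.) = {total:.3f}, tail = {p_total:.3f}")
```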




ACKNOWLEDGMENT 

The writer wishes to express his sincere appreciation to his major
professor, Dr. Arlin M. Feyerherm, who gave helpful suggestions and assistance
during the preparation of this report, and to Dr. H. C. Fryer and all the
faculty members in the Department of Statistics for their continuous
encouragement and instruction throughout his graduate study.




REFERENCES 


Anderson, T. W., "Probability models for analyzing time changes in
attitudes", Mathematical Thinking in the Social Sciences, Illinois. (1954)

Anderson, T. W. and Goodman, L. A., "Statistical inference about Markov
chains", Ann. Math. Stat., 28: 89-109. (1957)

Bartlett, M. S., "The frequency goodness of fit test for probability chains",
Proc. Cambridge Philos. Soc., 47: 86-95. (1951)

Cramer, H., Mathematical Methods of Statistics, Princeton Univ. Press,
Princeton. (1946)

Feller, W., An Introduction to Probability Theory and its Applications, John
Wiley and Sons, Inc. (1957)

Feyerherm, A. M. and Bark, L. D., "Statistical Methods for Persistent
Precipitation Patterns", Paper presented at Sixth National Conference on
Agricultural Meteorology, October 8-10. (1964)

Gabriel, K. R. and Neumann, J., "On a distribution of weather cycles by
length", Quart. J. R. Meteor. Soc., 83: 375-380. (1957)

Good, I. J., "The likelihood ratio test for Markov chain", Biometrika,
42: 531-3. (1955); Corrigenda, 44: 301. (1957)

Hoel, P. G., "A test for Markov chains", Biometrika, 41: 430-433. (1954)

Hogg, R. V. and Craig, A. T., Introduction to Mathematical Statistics,
2nd Ed., The Macmillan Company, New York. (1965)

Neyman, J., "Contribution to the theory of the $\chi^2$-test", Proceedings of
the Berkeley Symposium on Math. Stat. and Probability, Univ. of Calif. Press,
Berkeley, 239-274. (1949)

Parzen, E., Stochastic Processes, Holden-Day, Inc. (1962)

Wilks, S. S., Mathematical Statistics, John Wiley and Sons, Inc. (1963)


STATISTICAL  INFERENCE  ABOUT  MARKOV  CHAINS 


by


SAI-SING  LIN 
B.  Sc. ,  Taiwan  Normal  University,  1958 


AN  ABSTRACT  OF  A  MASTER'S  REPORT 

submitted  in  partial  fulfillment  of  the 

requirement  for  the  degree 

MASTER  OF  SCIENCE 

Department  of  Statistics 

KANSAS  STATE  UNIVERSITY 
Manhattan,  Kansas 


1966 


The determination of limiting distribution functions of certain functions
of n random variables as $n \to \infty$ is an important class of problems in
mathematical statistics.

In Markov chains, the maximum likelihood estimates and their asymptotic
distribution are obtained for the transition probabilities in a chain of
arbitrary order when there are repeated observations of the same chain.
Likelihood ratio tests and $\chi^2$-tests of the form used in contingency
tables are obtained for testing the following hypotheses: (a) $P_t$ is
stationary (i.e., $P_t = P = (p_{ij})$) against the alternative that it varies
over time, (b) P is a given matrix against the alternative that it is not,
(c) the process is a u-th order Markov chain against the alternative that it
is r-th but not u-th order. In case u = 0 and r = 1, case (c) results in tests
of the null hypothesis that observations at successive time points are
statistically independent against the alternative hypothesis that the
observations are from a first-order Markov chain.

There is some discussion of the relation between the likelihood ratio
criterion and $\chi^2$-tests of the form used in contingency tables. An example
which shows the usefulness of the theory is given.


